An auditory-based distortion measure with application to concatenative speech synthesis
Authors
Abstract
This study presents a new auditory-based distance measure with application to concatenative speech synthesis. The measure uses the Carney auditory model to produce a feature vector related to auditory perception. In concatenative synthesis, the measure is used to assess perceived discontinuities at segment transitions. Evaluations in a restricted-database environment show that the new measure can be effective in improving speech synthesis performance.
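As a rough illustration of how a distance measure can score a segment transition, the sketch below compares the last feature frame of one segment with the first frame of the next and picks the candidate with the smallest distance. The abstract does not state the form of the measure, so the Euclidean distance, the function names `join_distance` and `best_candidate`, and the toy feature values are all assumptions. The feature vectors here are placeholders, not output of the Carney auditory model.

```python
import math


def join_distance(left_frames, right_frames):
    """Distance across a join: last frame of the left segment vs. first frame of the right.

    Euclidean distance is an assumption made for illustration only.
    """
    a = left_frames[-1]
    b = right_frames[0]
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same dimension")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def best_candidate(left_frames, candidates):
    """Return the index of the candidate segment with the smallest join distance."""
    return min(
        range(len(candidates)),
        key=lambda i: join_distance(left_frames, candidates[i]),
    )


# Toy data: each segment is a list of frames, each frame a feature vector.
left = [[0.0, 0.0], [1.0, 2.0]]
candidates = [
    [[4.0, 6.0], [0.0, 0.0]],  # starts far from the end of `left`
    [[1.0, 2.5], [3.0, 3.0]],  # starts close to the end of `left`
]

print(join_distance(left, candidates[0]))
print(best_candidate(left, candidates))
```

A smaller value at the join is taken here to mean a less audible discontinuity. Whether that holds depends on how well the chosen features track perception, which is the question the study addresses.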
Similar resources
Feature extraction by auditory modeling for unit selection in concatenative speech synthesis
A comprehensive computational model of the human auditory peripherals was applied to extract basic features of speech sounds. The auditory model extracts features by the auditory temporal coding mechanism in addition to features by the auditory place coding mechanism which has traditionally been used as spectral features. It also considers the nonlinearity of human auditory responses. Several s...
FSM and k-nearest-neighbor for corpus based video-realistic audio-visual synthesis
In this paper we introduce a corpus-based 2D video-realistic audio-visual synthesis system. The system combines a concatenative Text-to-Speech (TTS) System with a concatenative Text-to-Visual (TTV) System into an audio lip-movement synchronized Text-to-Audio-Visual-Speech System (TTAVS). For the concatenative TTS we are using a Finite State Machine approach to select non-uniform variable-size audio ...
Power Spectral Densit Equalization of Large Sp Concatenative T
This paper proposes a channel equalization algorithm for a large speech database with application in concatenative TTS systems. The convolutional channel distortion is equalized by comparing the power spectral densities (PSDs) of utterances of different recording sessions. Autoregressive linear filters are designed on a corpus level and are used offline to filter the corresponding sentences to ...
Synchronization of speech frames based on phase data with application to concatenative speech synthesis
Synchronization of speech frames is an important issue in a concatenative speech synthesis system. In signal-processing terms, this translates into removing linear phase mismatches between concatenated speech frames. This paper presents two novel approaches to the problem of synchronization of speech frames with an application to concatenative speech synthesis. Both methods are based on a pr...
Optimized stopping criteria for tree-based unit selection in concatenative synthesis
The lack of naturalness hampers the widespread application of speech synthesis. Increasing the size of the unit database in a concatenative speech synthesizer has been proposed as a method to increase the variety of units—thereby improving naturalness. However, expanding the unit database increases the computational cost of selecting the most appropriate unit and compounds the risk that a perce...
Journal title:
- IEEE Trans. Speech and Audio Processing
Volume 6, issue
Pages -
Publication date 1998